As a group of scientists, our team is curious about how humans can affect the natural environment. We wondered if weather conditions in a highly populated region might correlate with notable human activities on a global or local scale.

There is a variety of literature available on global climate change, suggesting that the Earth’s surface temperature is increasing, on average, over time in a way that threatens many lifeforms on our planet. This warming is primarily due to human activity: the combustion of fossil fuels resulting in the emission of carbon dioxide into the atmosphere. We decided to consider local weather trends in a specific urban area to see what we might learn.

After some searching, we discovered that historical daily weather records from locations around the United States are made publicly available by the National Oceanic and Atmospheric Administration via its website: https://www.ncdc.noaa.gov/cdo-web. We found that a data station in the middle of Central Park in New York City has made more than 56,000 daily weather observations dating back to 1869. The variables observed include daily maximum temperature in degrees Fahrenheit (TMAX), daily minimum temperature in degrees Fahrenheit (TMIN), and daily precipitation (PRCP) and snowfall (SNOW) in inches. We decided to analyze these data to see what trends we might uncover. Our data span the days from February 28, 1869, to September 26, 2022.
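As a rough illustration of how such a file might be read, the sketch below parses a daily-summaries CSV using only the Python standard library. The column name DATE, the helper names, and the two sample rows are assumptions made for illustration; the rows are made-up placeholders, not actual Central Park observations.

```python
import csv
import io
from datetime import date

def _to_float(text):
    """Return a float, or None when the field is blank (a missing observation)."""
    text = (text or "").strip()
    return float(text) if text else None

def read_daily_records(handle):
    """Parse a daily-summaries CSV into a list of dicts, one per day.

    Assumes columns named DATE (ISO format), TMAX, TMIN, PRCP, and SNOW.
    """
    records = []
    for row in csv.DictReader(handle):
        records.append({
            "date": date.fromisoformat(row["DATE"]),
            "tmax": _to_float(row.get("TMAX")),
            "tmin": _to_float(row.get("TMIN")),
            "prcp": _to_float(row.get("PRCP")),
            "snow": _to_float(row.get("SNOW")),
        })
    return records

# Made-up placeholder rows standing in for a downloaded file.
sample = io.StringIO(
    "DATE,TMAX,TMIN,PRCP,SNOW\n"
    "1869-02-28,38,26,0.00,0.0\n"
    "1869-03-01,,20,0.12,\n"
)
records = read_daily_records(sample)
print(len(records), records[1]["tmax"])
```

Keeping blank fields as None rather than zero matters later, because not every variable was recorded on every day.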

Temperature data from Central Park have been studied in the past, and warming trends were observed; we wanted to see what we might discover on our own, from analysis of the original data.

Our Research Questions

Given that Central Park is an oasis in the middle of a city environment, we wondered whether any weather trends correlate with major human activities, both globally and locally. To explore these possibilities, we formulated the following preliminary questions:

  1. Are there statistically measurable changes in weather patterns (e.g., temperature and precipitation levels) over time in New York City?
  2. Do temperature trends in New York City align with the documented rise in average global temperatures over the last century?
  3. Are there any notable (and statistically significant) changes in weather patterns during or after the pandemic lockdown (perhaps due to changes in commuting patterns)?

To dig in, we began with temperature.

Linear Models of Temperature over Time

We created a linear model of TMAX vs. year to understand temperature trends since 1900. The fit parameters were statistically significant and suggested that both the maximum and minimum daily temperatures in New York’s Central Park have increased on average over time at a rate of approximately 0.026 degrees Fahrenheit per year. Although the p-value for the slope is < 2e-16 (well below the threshold of alpha = 0.05), the overall fit is poor, with an adjusted r-squared value of 0.00245.
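To illustrate how a significant slope can coexist with a very low r-squared, here is a minimal sketch of a one-variable least-squares fit in plain Python. It is not the code used for the analysis; the function name fit_line and the synthetic data (a small trend buried under a large seasonal swing) are assumptions chosen only to mimic the situation described.

```python
import math
import random

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Synthetic stand-in for daily maxima (NOT the NOAA observations): a small
# upward trend, a large seasonal swing, and random day-to-day noise.
random.seed(0)
years, temps = [], []
for year in range(1900, 2023):
    for month in range(1, 13):
        for _ in range(5):  # five made-up days per month
            seasonal = 23.0 * math.cos(2 * math.pi * (month - 7) / 12)
            years.append(year)
            temps.append(50.0 + 0.025 * (year - 1900) + seasonal + random.gauss(0, 5))

slope, intercept, r_squared = fit_line(years, temps)
print(f"slope = {slope:.4f} deg F/year, r-squared = {r_squared:.4f}")
```

Because the seasonal swing dominates the total variance, year alone explains only a tiny fraction of it even though the trend is recovered.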

The poor fit is likely due to the wide range of daily temperatures that occur in a given year as a result of seasonal variation. The following plot of daily maximum temperatures shows the wide variance of the data around the linear model.

To improve the fit and model temperature trends more completely, we decided to account for seasonal variation by also including month as a categorical regressor. The resulting fit has an r-squared value of 0.775 and a slope of 0.025 degrees Fahrenheit per year, with all fit parameters’ p-values well below 0.05. The different intercepts for each level of the categorical variable (the twelve months of the year) indicate that January is the coldest and July the hottest month in Central Park, with an average difference in maximum daily temperature of approximately 46 degrees Fahrenheit in any given year over this window.
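A model with one shared slope and a separate intercept per month can be sketched without a statistics package: subtracting each month's mean from both year and temperature, then regressing the centered values, yields the same slope as a least-squares fit with month indicator variables. The function name and the synthetic data below are assumptions for illustration, not the analysis code or the NOAA data.

```python
import math
import random
from collections import defaultdict

def fit_year_with_month_effects(years, months, temps):
    """Least squares for temp = intercept[month] + slope * year.

    Returns (slope, {month: intercept}, r_squared).
    """
    by_month = defaultdict(list)
    for i, month in enumerate(months):
        by_month[month].append(i)
    mean_year = {m: sum(years[i] for i in idx) / len(idx) for m, idx in by_month.items()}
    mean_temp = {m: sum(temps[i] for i in idx) / len(idx) for m, idx in by_month.items()}

    # Regress month-centered temperature on month-centered year.
    sxx = sxy = 0.0
    for yr, m, t in zip(years, months, temps):
        dx = yr - mean_year[m]
        sxx += dx * dx
        sxy += dx * (t - mean_temp[m])
    slope = sxy / sxx
    intercepts = {m: mean_temp[m] - slope * mean_year[m] for m in by_month}

    grand_mean = sum(temps) / len(temps)
    ss_res = sum((t - intercepts[m] - slope * yr) ** 2
                 for yr, m, t in zip(years, months, temps))
    ss_tot = sum((t - grand_mean) ** 2 for t in temps)
    return slope, intercepts, 1.0 - ss_res / ss_tot

# Synthetic stand-in data (NOT the NOAA observations).
random.seed(0)
years, months, temps = [], [], []
for year in range(1900, 2023):
    for month in range(1, 13):
        for _ in range(5):  # five made-up days per month
            seasonal = 23.0 * math.cos(2 * math.pi * (month - 7) / 12)
            years.append(year)
            months.append(month)
            temps.append(50.0 + 0.025 * (year - 1900) + seasonal + random.gauss(0, 5))

slope, intercepts, r_squared = fit_year_with_month_effects(years, months, temps)
print(f"slope = {slope:.4f}, r-squared = {r_squared:.3f}")
print(f"July minus January intercept = {intercepts[7] - intercepts[1]:.1f} deg F")
```

The difference between any two month intercepts is the model's estimate of the average temperature gap between those months in the same year.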

These two extremes and their linear models are plotted in the following figure; it is clear that the multiple regression is a much better model of temperature trends, consistent with the higher r-squared value.

To create an even better model, we’d need to use sinusoidal functions that capture the cyclical variation of weather with season; this would take us into scientific data analysis and time series modeling and is a topic for future consideration.
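One common form such a model could take is a single annual harmonic added to the linear trend. The symbols below are assumptions introduced for illustration: d(t) is the day of the year for observation t, and beta_0, beta_1, A, and B are coefficients to be estimated.

```latex
\mathrm{TMAX}(t) = \beta_0 + \beta_1\,\mathrm{year}(t)
  + A \sin\!\left(\frac{2\pi\, d(t)}{365.25}\right)
  + B \cos\!\left(\frac{2\pi\, d(t)}{365.25}\right)
  + \varepsilon(t)
```

Because this expression is linear in its four coefficients, it could still be fit by ordinary least squares, replacing the twelve month intercepts with two seasonal terms.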

Consideration of another New York location

We wondered whether these trends hold for other locations in the New York City area. To assess this, we found data from another NOAA station at JFK International Airport. Because these data go back only as far as the airport itself (which was built in 1948), we focused on the period from 1948 onward, computing linear models for both locations over this time window.

We were surprised to notice that the slope of the Central Park model for 1948 onward was lower than that of the model including observations from 1900 onward: only 0.014 degrees Fahrenheit per year (r-squared = 0.771, all p << 0.05) compared to 0.025 (r-squared = 0.773, all p << 0.05). This suggests that average Central Park warming was greater in the first half of the 20th century than in the second half, which is not what we would expect given the understanding that the global rate of warming is increasing.

We also found a higher warming rate at the JFK airport site, of approximately 0.033 degrees Fahrenheit per year.

To see whether the differences among the three warming rates (Central Park post-1900, Central Park post-1948, and JFK airport) are real, we examined the 95 percent confidence intervals associated with each of the three slopes.

The confidence intervals for the slopes of the three models do not overlap, suggesting that the warming rates are statistically distinguishable from one another. This implies that:
1. Temperature trends in Central Park are not strictly linear between 1900 and 2022.
2. The rate of warming at JFK airport is actually greater than that in Central Park between 1948 and 2022.
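The interval comparison above can be sketched as follows. This is a simplified illustration, not the analysis code: it uses a one-variable fit, approximates the 95 percent interval as the slope plus or minus 1.96 standard errors (reasonable only for large samples), and runs on synthetic series whose true slopes are set, by assumption, to 0.014 and 0.033. All function names are hypothetical.

```python
import math
import random

def slope_confidence_interval(x, y, z=1.96):
    """Approximate confidence interval for the least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(ss_res / (n - 2) / sxx)  # standard error of the slope
    return slope - z * std_err, slope + z * std_err

def intervals_overlap(a, b):
    """True when the closed intervals a = (lo, hi) and b = (lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

def synthetic_series(true_slope, rng):
    """Made-up daily values from 1948 to 2022 with a known trend (NOT NOAA data)."""
    years, temps = [], []
    for year in range(1948, 2023):
        for _ in range(365):
            years.append(year)
            temps.append(60.0 + true_slope * (year - 1948) + rng.gauss(0, 5))
    return years, temps

rng = random.Random(0)
ci_a = slope_confidence_interval(*synthetic_series(0.014, rng))
ci_b = slope_confidence_interval(*synthetic_series(0.033, rng))
print(ci_a, ci_b, "overlap:", intervals_overlap(ci_a, ci_b))
```

One caveat worth keeping in mind: daily temperatures are correlated from one day to the next, so standard errors computed this way tend to be too small, and intervals from real daily data may be narrower than they should be.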

We hypothesize that these trends could be related to the rate of development in the areas in question, because concrete can hold more heat than undeveloped land. To test this hypothesis, we could look for other data sets for these locations over the same time period that include some measure of construction.

Non-time-dependent relationships between weather variables

As we move forward with these data, we will further explore relationships between weather variables. For a preview of how this might look, we created a simple correlation plot of all numeric variables for Central Park since 1900.

Preliminarily, it appears that the slight correlation between year and maximum and minimum daily temperatures is reflected here, and that snow has an inverse correlation with the temperature variables. Bringing month back in as a categorical variable might be interesting here. We also note that there seems to be an inverse correlation between average daily temperature and year. However, the plot compares only pairwise relationships for which there are data, and not every day has data for average daily temperature, so these data may not be appropriately representative.
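The effect of missing values on a pairwise correlation can be seen in a small sketch. The code below computes Pearson correlations using only the days on which both variables are present; the function names, the column label tavg, and the toy values are assumptions for illustration, not the report's data.

```python
import math

def pairwise_correlation(a, b):
    """Pearson correlation over positions where both values are present.

    Returns None when fewer than two complete pairs exist or a column is constant.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Map each (name, name) pair to its pairwise-complete correlation."""
    names = list(columns)
    return {(p, q): pairwise_correlation(columns[p], columns[q])
            for p in names for q in names}

# Toy values (made up): tavg is missing for the first two rows.
columns = {
    "year": [1900, 1901, 1902, 1903],
    "tmax": [40.0, 42.0, 41.0, 45.0],
    "tavg": [None, None, 50.0, 49.0],
}
matrix = correlation_matrix(columns)
print(matrix[("year", "tmax")], matrix[("year", "tavg")])
```

In this toy case the year and tavg correlation rests on just two rows, so it comes out as a perfect inverse relationship regardless of what the missing rows would have shown. A sparsely recorded variable can produce a similarly misleading entry in a real correlation plot.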

Lockdown effects on Temperature

The last question our team wanted to address concerned the changes in weather patterns that may be associated with the COVID-19 lockdown. The COVID-19 lockdown had major social, economic, and political impacts in 2020. The lockdown in New York City in spring of 2020 was one of the earliest in effect and brought unprecedented changes in traffic and daily life for those who visited and worked in NYC daily. Our team set out to see if these major changes to the city were noticeable in the weather patterns at the time.

To do this, we compared the average daily maximum temperatures for spring (April and May) and summer (June, July, and August) between the years leading up to the pandemic and the months following the lockdown order (Figure C4).

A t-test was performed to compare the means of the pre-lockdown and post-lockdown maximum daily temperatures (Table C2). The summer following lockdown appeared to be warmer on average than the preceding summers; however, the difference in average maximum temperatures was not statistically significant.
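For readers unfamiliar with the test, here is a minimal sketch of a two-sample comparison of means. The report does not state which variant of the t-test was used; this sketch assumes Welch's unequal-variance form and, to stay within the standard library, approximates the two-sided p-value with a normal distribution, which is reasonable only for large samples. The temperature values are made up for illustration.

```python
import math
from statistics import NormalDist, mean, variance

def welch_t_test(a, b):
    """Welch's two-sample t statistic, its degrees of freedom, and an
    approximate two-sided p-value based on the normal distribution."""
    var_a = variance(a) / len(a)  # squared standard error of mean(a)
    var_b = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    p_approx = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, df, p_approx

# Made-up daily maxima in degrees Fahrenheit (NOT the report's data).
pre_lockdown = [66.0, 68.0, 67.0, 69.0, 65.0]
post_lockdown = [63.0, 64.0, 62.0, 65.0, 61.0]

t, df, p_approx = welch_t_test(pre_lockdown, post_lockdown)
print(f"t = {t:.2f}, df = {df:.1f}, approximate p = {p_approx:.5f}")
```

With samples this small, an exact p-value from the t distribution would be the appropriate choice; the normal approximation is shown only to keep the example free of third-party dependencies.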

The spring lockdown months showed a statistically significant difference in the means pre- and post-lockdown order. The post-lockdown months were significantly cooler, with a mean spring temperature of 63.5°F, than the years preceding the COVID-19 pandemic, which had a mean spring temperature of 67.3°F. This is an interesting finding, as similar studies found a decrease in daytime land surface temperature during COVID-19 lockdown (Parida et al., 2021). The authors attribute the change in temperature to the change in aerosols in the air. The contrast in results warrants further study to explore this trend.

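Conclusion

Daily maximum temperatures in Central Park have risen since 1900 at approximately 0.025 degrees Fahrenheit per year once seasonal variation is accounted for by month. The rate is not constant: the Central Park slope from 1948 onward is lower (0.014), the JFK airport slope over the same window is higher (0.033), and the 95 percent confidence intervals of the three slopes do not overlap. Spring maximum temperatures following the 2020 lockdown order were significantly cooler than in the preceding years (63.5°F vs. 67.3°F), while the summer difference was not statistically significant.

These findings raise new questions. Are the differing warming rates related to the rate of development around each station? Would a sinusoidal time series model of seasonal variation change the estimated trends? How do the weather variables relate to one another once month is included as a categorical variable? Each is a candidate for future analysis.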

References